Clustering Using Cemetery Organization Behavior of Ants
Authors
Abstract
Clustering is the unsupervised classification of patterns (data items, observations, or feature vectors) into groups (clusters). The clustering problem has been addressed by researchers in many disciplines and in different contexts. Due to the escalating amount of data available online, the World Wide Web has become one of the most valuable resources for information retrieval and knowledge discovery, and web mining technologies are well suited to knowledge discovery on the Web. In this paper, we focus on clustering web pages based on their content. A web page clustering system is valuable in web search for grouping search results into sets of strongly related documents, and it can improve similarity search by focusing on sets of pertinent documents. At the same time, because a large variety of noisy information is embedded in web pages, web page clustering is considerably more intricate than pure-text clustering. This paper addresses the web page clustering problem through a technique inspired by the cemetery organization behavior of ants. The proposed technique begins by reducing the dimensionality of the web page index using Latent Semantic Indexing (LSI). The web pages are then mapped onto a two-dimensional grid space using the cemetery organization behavior of ants. Finally, the web pages represented in this two-dimensional grid space are clustered using the k-means algorithm. The paper also demonstrates the impact of LSI-based dimensionality reduction and of the distance measure on the web page clustering results.
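The grid-mapping step can be illustrated with a minimal sketch. The abstract does not state the exact rules used, so the code below assumes Deneubourg-style pick-up and drop probabilities combined with a Lumer-Faieta-style local similarity, which is one common formulation of ant cemetery clustering and not necessarily the authors' method. The names `ant_cluster`, `local_similarity`, `k1`, `k2`, and `alpha` are hypothetical, and the input vectors stand in for LSI-reduced web page vectors.

```python
import math
import random


def pick_probability(f, k1=0.1):
    """Chance that an unladen ant picks up an item whose local similarity is f."""
    return (k1 / (k1 + f)) ** 2


def drop_probability(f, k2=0.15):
    """Chance that a laden ant drops its item where local similarity is f."""
    return (f / (k2 + f)) ** 2


def local_similarity(item, pos, grid, vectors, size, alpha, radius=1):
    """Average of (1 - distance / alpha) over neighbouring cells, floored at 0."""
    x, y = pos
    total = 0.0
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            if dx == 0 and dy == 0:
                continue
            other = grid.get(((x + dx) % size, (y + dy) % size))
            if other is not None and other != item:
                total += 1.0 - math.dist(vectors[item], vectors[other]) / alpha
    neighbours = (2 * radius + 1) ** 2 - 1
    return max(0.0, total / neighbours)


def ant_cluster(vectors, size=10, n_ants=5, steps=20000, alpha=0.5, seed=0):
    """Return {item index: (x, y)} after ants rearrange items on a toroidal grid."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    # Scatter the items over distinct random cells (assumes len(vectors) <= size**2).
    grid = {pos: i for i, pos in enumerate(rng.sample(cells, len(vectors)))}
    ants = [{"pos": rng.choice(cells), "load": None} for _ in range(n_ants)]
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    for _ in range(steps):
        ant = rng.choice(ants)
        x, y = ant["pos"]
        dx, dy = rng.choice(moves)
        pos = ((x + dx) % size, (y + dy) % size)
        ant["pos"] = pos
        if ant["load"] is None and pos in grid:
            # Items in dissimilar or sparse neighbourhoods are likely to be picked up.
            f = local_similarity(grid[pos], pos, grid, vectors, size, alpha)
            if rng.random() < pick_probability(f):
                ant["load"] = grid.pop(pos)
        elif ant["load"] is not None and pos not in grid:
            # Carried items are likely to be dropped next to similar items.
            f = local_similarity(ant["load"], pos, grid, vectors, size, alpha)
            if rng.random() < drop_probability(f):
                grid[pos] = ant["load"]
                ant["load"] = None

    # Ants still carrying an item at the end put it on a random free cell.
    for ant in ants:
        if ant["load"] is not None:
            free = [c for c in cells if c not in grid]
            grid[rng.choice(free)] = ant["load"]
            ant["load"] = None

    return {item: pos for pos, item in grid.items()}
```

Calling `ant_cluster(vectors)` on a list of equal-length numeric tuples returns one grid coordinate per item. In the pipeline described above, those coordinates would then be passed to k-means. The value of `alpha` should be on the scale of typical distances between similar items, and the result depends on the random seed and the number of steps.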
Similar resources
The phase-ordering kinetics of cemetery organization in ants
The clustering of dead bodies by ants is simulated, using a cellular automaton model, the rules of which are carefully derived from experiments. Starting from a random spatial distribution of corpses, a cemetery organizes itself into clusters of corpses. The dynamics of clustering can be compared to the phase-ordering kinetics of a bidimensional idealized magnetic system with scalar conserved o...
ACO-based document clustering method
Ant systems are flexible to implement and can scale because they are based on multi-agent cooperation. The aim of this publication is to show the universal character of that solution and its potential for implementation in a wide range of applications. The increasing demand for effective methods of managing large document collections is a sufficient stimulus to place the researc...
Application of Ant-based Template Matching for Web Documents Categorization
The self-organization behavior exhibited by ants may be modeled to solve real world clustering problems. The general idea of artificial ants walking around in search space to pick up, or drop an item based upon some probability measure has been examined to cluster a large number of World Wide Web (WWW) documents. However, this idea is extended with the direct application of template matching wi...
Phase-ordering kinetics of cemetery organization in ants
Eric Bonabeau, Guy Theraulaz, Vincent Fourcassié, and Jean-Louis Deneubourg Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, New Mexico 87501 Laboratoire d’Ethologie et de Psychologie Animale, CNRS, UMR 5550, Université Paul Sabatier, 118 route de Narbonne, 31062 Toulouse Cédex, France Unit of Theoretical Behavioural Ecology, Service de Chimie-Physique, CP 231, Université Libre de Bruxelles, ...
An Efficient Ant Algorithm for Swarm-Based Image Clustering
A collective approach to resolving the segmentation problem was proposed. AntClust is a new ant-based algorithm that uses the self-organizing and autonomous brood sorting behavior observed in real ants. Ants and pixels are scattered on a discrete array of cells representing the ants’ environment. Using simple local rules and without any central control, ants form homogeneous clusters by moving pixel...
Publication date: 2014